


The College Board
Advanced Placement Program®


AP® Statistics 
2002 Scoring Guidelines 











The materials included in these files are intended for use by AP teachers for course 
and exam preparation in the classroom; permission for any other use must be 
sought from the Advanced Placement Program®. Teachers may reproduce them, in 
whole or in part, in limited quantities, for face-to-face teaching purposes but may 


not mass distribute the materials, electronically or otherwise. These materials and 
any copies made of them may not be resold, and the copyright notices must be 
retained as they appear here. This permission does not apply to any third-party 
copyrights contained herein. 





These materials were produced by Educational Testing Service® (ETS®), which develops and administers the examinations of the Advanced Placement 
Program for the College Board. The College Board and Educational Testing Service (ETS) are dedicated to the principle of equal opportunity, and their 
programs, services, and employment policies are guided by that principle. 


The College Board is a national nonprofit membership association dedicated to preparing, inspiring, and connecting students to college and opportunity. 
Founded in 1900, the association is composed of more than 4,200 schools, colleges, universities, and other educational organizations. Each year, the 
College Board serves over three million students and their parents, 22,000 high schools, and 3,500 colleges, through major programs and services in 

college admission, guidance, assessment, financial aid, enrollment, and teaching and learning. Among its best-known programs are the SAT®, the 
PSAT/NMSQT®, and the Advanced Placement Program® (AP®). The College Board is committed to the principles of equity and 
excellence, and that commitment is embodied in all of its programs, services, activities, and concerns. 


Copyright © 2002 by College Entrance Examination Board. All rights reserved. College Board, Advanced Placement Program, AP, SAT, and the acorn logo 
are registered trademarks of the College Entrance Examination Board. APIEL is a trademark owned by the College Entrance Examination Board. PSAT/NMSQT is a 
registered trademark jointly owned by the College Entrance Examination Board and the National Merit Scholarship Corporation. 
Educational Testing Service and ETS are registered trademarks of Educational Testing Service. 




Question 1 
Solution 
Part (a): 


The precision of the estimates of γ has gotten better over time. This is indicated by the
fact that the intervals
value ± (margin of error)
shown in the figure become narrower over time, indicating that the margin of error is
getting smaller.


Part (b): 


The value γ = 0 is not included in any of the 21 intervals. This indicates that 0 is not a
plausible value for γ. There is no support for Newton's theory.


Part (c): 


The support for Einstein's theory that γ = 1 is quite strong. Most of the intervals contain
the value 1, and the more recent intervals, where the precision is greater, suggest that the
value of γ is at least very close to 1.


Scoring 


Part (a) is scored as incorrect (I), partially correct (P), or essentially correct (E). The
response is essentially correct if the student indicates that precision is increasing. (If the 
student also explains how he or she can tell this from the figure, this is a plus.) 


If the student incorrectly says that precision is decreasing, but gives a good explanation 
that is tied to the figure, the response is scored as partially correct. 


Part (b) is scored as incorrect (I), partially correct (P), or essentially correct (E). To be 
scored as essentially correct, the response must say that there is no (or very weak) support 
and give a valid reason for this conclusion based on the intervals in the figure. 


If the student states only that there is no (or weak) support, but does not say how this 
follows from the intervals in the figure, the response is scored as partially correct. 


Part (c) is scored as incorrect (I), partially correct (P), or essentially correct (E). To be 
scored as essentially correct, the response must say that there is strong support and give a 
valid reason for this conclusion based on the intervals in the figure. 


If the student states only that there is strong support, but does not say how this follows 
from the intervals in the figure, the response is scored as partially correct. 
4 Complete Response 
All three parts essentially correct. 
3 Substantial Response 
Two parts essentially correct and one part partially correct. 
2 Developing Response 
Two parts essentially correct and no parts partially correct.
OR
One part essentially correct and two parts partially correct.
OR 
Three parts partially correct. 
1 Minimal Response 
One part essentially correct and either zero or one parts partially correct. 


OR 
No parts essentially correct and two parts partially correct. 
Question 2
Solution 


Part (a): 
A paired design is used in which each subject receives a pair of boots where one boot is treated with the 
new method and the other with the current method. 
Subjects should be randomly assigned to one of two groups. Group 1 would have the new
method applied to the right boot; group 2 would have the new method applied to the left boot. 
OR 


For each subject, whether the new method is applied to the right or left boot is determined at 
random. 
OR 
A crossover design is used in which each subject receives a pair of boots, both of which were treated 
with one treatment. The boots are used for three months and then exchanged for a second pair of boots, 
both of which were treated with the other treatment. These boots are then used for the next three months. 
Subjects should be randomly assigned to one of two groups. One group receives boots with the 
new treatment first and the other group receives boots with the current method first. 


NOTE: Additional appropriate blocking schemes are considered extraneous. 
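
The per-subject randomization in the paired design can be sketched as follows. This is only an illustration, not part of the official rubric: the function name `assign_boots`, the volunteer numbering, and the fixed seed are assumptions made for the example (the 100 volunteers come from the scoring notes).

```python
import random

def assign_boots(subject_ids, seed=None):
    """For each subject, choose at random which boot gets the new method."""
    rng = random.Random(seed)  # seed only makes this sketch reproducible
    assignment = {}
    for sid in subject_ids:
        new_side = rng.choice(["left", "right"])
        current_side = "right" if new_side == "left" else "left"
        assignment[sid] = {"new": new_side, "current": current_side}
    return assignment

# Hypothetical IDs 1..100 for the 100 volunteers
volunteers = list(range(1, 101))
plan = assign_boots(volunteers, seed=2002)
print(plan[1])
```

Every subject ends up with one boot per treatment, and chance alone decides which side receives the new method, which is what the rubric means by randomization appropriate to the design.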


Part (b): 
The design could be double blind, as long as both the subjects and the person evaluating the boots for 
water damage do not know which boots were treated with the new method and which were treated with 
the current method. 


NOTE: If the student does something unexpected in part (a) and gives a design that actually cannot be 
double blind, then part (b) could be considered correct provided the response explains why the design 
could not be double blind. 


Scoring 
A student response is scored as E (essentially correct), P (partially correct), or I (incorrect) for each of the 
following key elements: 
1. Design
• E - paired design (may be described as blocking on individual) or crossover design
• P - 2 or more groups (e.g., Completely Randomized Design)
• I - no grouping or grouping with no treatments specified
2. Implementation: Randomization appropriate to the design
• E - Written description of appropriate randomization
• P - Incomplete or incorrect description of randomization
• I - No description of randomization
NOTE: (1) Diagram alone can be scored at most a P.
(2) The randomization must apply to the allocation or assignment of subjects to the
treatment groups or the allocation of treatments to the subjects.
(3) Randomization to select the 100 volunteers without assignment to the treatment
groups is scored an I.
3. Double blind: Explanation in parts (a) and/or (b) that shows understanding of what it means for
an experiment to be double blind.
• E - response indicates that blinding applies to both the evaluator and subjects.
• P - response recognizes that blinding applies to the subjects and at least one other party,
whether or not they think that this can be accomplished; the other party may not be
correctly identified.
• I - response fails to recognize that both the subject and another party must be blinded or is
missing or irrelevant.

Score as Design - Randomization - Double Blind

4 Complete Response
EEE

3 Substantial Response
Any one of the following combinations:
EEP   PEE   PEP*
EEI
EPE

2 Developing Response
Any one of the following combinations:
EPP   PEI   IEE   PEP*
EPI   PPE   IPE
EIE   PPP
EIP   PIE

1 Minimal Response
Any one of the following combinations:
EII   PPI   IEP
IEI   PIP
IIE   IPP

0 No Credit
PII   III
IPI
IIP

* PEP may be scored as either a 2 or a 3:
(1) If the description of the randomization only says, “Randomly allocate”, then score PEP a 2.
(2) If the description of the randomization says, “Randomly allocate”, but also contains greater detail
about the randomization or the inclusion of blocking in the design or other statistical thinking, then
score PEP a 3.
Question 3


Solution 
Part (a): 


For runner 3 


$P(\text{time} < 4.2) = P\left(z < \frac{4.2 - 4.5}{0.14}\right) = P(z < -2.14) = 0.0162$ (from table)
OR 
P(time < 4.2) = 0.0160622279 (from Calculator) 


It is possible but unlikely that runner 3 will run a mile in less than 4.2 minutes on the next 
race. Based on his running time distribution, we would expect that he would have times 
less than 4.2 minutes less than 2 times in 100 races in the long run. 


OR 
It is possible but unlikely that runner 3 will run a mile in less than 4.2 minutes on the next 


race because 4.2 is more than 2 standard deviations below the mean. Since the running 
time has a normal distribution, it is unlikely to be more than 2 standard deviations below 


the mean. 
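
The Part (a) probability can be checked with a short sketch. It assumes runner 3's time is normal with mean 4.5 and standard deviation 0.14 (the values used for runner 3 in Part (b)) and uses `statistics.NormalDist` from the Python standard library (3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

# Runner 3: mean 4.5 minutes, standard deviation 0.14 minutes
runner3 = NormalDist(mu=4.5, sigma=0.14)

z = (4.2 - 4.5) / 0.14      # standardized value of 4.2 minutes
p_fast = runner3.cdf(4.2)   # P(time < 4.2)

print(round(z, 2), round(p_fast, 4))
```

The exact probability is slightly smaller than the table value because the table lookup rounds z to -2.14 first.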
Part (b): 
$\mu_T = \mu_1 + \mu_2 + \mu_3 + \mu_4 = 4.9 + 4.7 + 4.5 + 4.8 = 18.9$


The runners’ times are independently distributed, therefore
$\sigma_T^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \sigma_4^2 = (.15)^2 + (.16)^2 + (.14)^2 + (.15)^2 = 0.0902$


$\sigma_T = \sqrt{0.0902} = 0.3003$
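
A minimal sketch of the Part (b) arithmetic: means add, and for independent times variances (not standard deviations) add. The last line reproduces the incorrect route that the scoring CAUTION warns about, which happens to land near the right number. Variable names are illustrative.

```python
from math import sqrt

means = [4.9, 4.7, 4.5, 4.8]     # runner means, in minutes
sds = [0.15, 0.16, 0.14, 0.15]   # runner standard deviations

team_mean = sum(means)                 # means add
team_var = sum(s ** 2 for s in sds)    # variances add under independence
team_sd = sqrt(team_var)

# Incorrect method: summing the SDs and dividing by sqrt(4)
wrong_sd = sum(sds) / sqrt(len(sds))

print(round(team_mean, 1), round(team_var, 4), round(team_sd, 4), round(wrong_sd, 4))
```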


Part (c): 


$P(\text{team time} < 18.4) = P\left(z < \frac{18.4 - 18.9}{0.3003}\right) = P(z < -1.67) = 0.0475$ (from table)


OR


$P(\text{team time} < 18.4) = 0.0479561904$ (from calculator)
Scoring 
Each part is scored as essentially correct (E), partially correct (P), or incorrect (I).


Part (a) is essentially correct if: 
the probability is calculated correctly, it is correctly assessed as unlikely, and the 
justification is acceptable 
OR 
the student does not compute the probability, but appeals to the fact that a time of 4.2 
minutes is more than 2 standard deviations below the mean of a normal distribution and 
then uses this information to reach a conclusion with appropriate communication. 


Part (a) is partially correct if: 
the probability computed is not correct (for example, P(z > -2.14) or P(z < +2.14) might
be computed), but the given probability is correctly assessed 
OR 
an argument is based on the number of standard deviations from the mean without 
invoking normality. 


Part (b) is essentially correct if both the mean and the standard deviation of the team time 
distribution are correctly computed (except for purely arithmetic mistakes). 


Part (b) is partially correct if only one of these is correctly computed (except for purely 
arithmetic mistakes). 


CAUTION: A standard deviation of .3 (numerically correct) can arise from this incorrect
calculation: $\frac{(.15 + .16 + .14 + .15)}{\sqrt{4}} = 0.3$


Part (c) is essentially correct if the probability is correctly calculated using a mean which is
either correct or carried from (b) as well as a standard deviation which is either correct or carried
from (b).


Part (c) is partially correct if: 


both the mean and standard deviation are correct or carried from (b), but the computed 
probability is incorrect 

OR 

the mean or standard deviation is incorrectly derived from (b) but the subsequent 
probability calculation is correct. 
4 Complete Response 
All three parts essentially correct. 
3 Substantial Response 
Two parts essentially correct and one part partially correct. 


2 Developing Response 
Two parts essentially correct and no parts partially correct. 
OR 
One part essentially correct and two parts partially correct. 
OR 
One part essentially correct and one part partially correct. 
OR 
Three parts partially correct. 


1 Minimal Response 
One part essentially correct and zero parts partially correct. 
OR 
No parts essentially correct and two parts partially correct. 
Question 4
Solution 


Part (a): 
Predicted cost = 1136 + 14.673 (number of passenger seats) 


OR 


$\hat{y} = 1136 + 14.673x$ where y = operating cost per hour
and x = number of passenger seats 


Part (b): 
• The value of the correlation coefficient


$r = +\sqrt{0.570} = 0.755$ (r is positive because the scatterplot shows a positive
association)
• The interpretation of correlation


There is a moderate (or strong) positive linear relationship between operating 
costs per hour and number of passenger seats. 


OR 


Fifty-seven percent of the variability in operating cost per hour can be explained 
by a linear relationship between cost and number of passenger seats AND the 
relationship is positive. 
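
As a small worked check of Parts (a) and (b), the sketch below takes the intercept 1136, slope 14.673, and R-Sq of 57.0 percent from the solution, and recovers r by giving the square root the sign of the slope. The helper name `predicted_cost` and the 300-seat input are illustrative assumptions.

```python
from math import copysign, sqrt

intercept = 1136.0   # from the regression output
slope = 14.673       # cost per hour per additional passenger seat
r_squared = 0.570    # R-Sq as a proportion

# r has the magnitude sqrt(R-Sq) and the sign of the slope
r = copysign(sqrt(r_squared), slope)

def predicted_cost(seats):
    """Predicted operating cost per hour for a given number of seats."""
    return intercept + slope * seats

print(round(r, 3), round(predicted_cost(300), 1))
```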


Part (c): 
No. The equation of the least-squares regression line is influenced by the three points in 
the upper right-hand corner and the two points in the lower left-hand corner of the 
scatterplot. The seven remaining points (with number of seats in the 250 to 350 range) 
would have a negative correlation. Hence, the slope of the recalculated least-squares 
regression line is negative. 


Scoring 
The student response should include the following elements: 
1. the correct equation of the least squares regression line with variables correctly
defined;
2. the correct value for the correlation coefficient;
3. a correct interpretation of the given correlation coefficient; and
4. a complete explanation of why the given least-squares line would not be
appropriate for describing the relationship over the restricted range.


Each element is scored as essentially correct (E), partially correct (P), or incorrect (I). 


Part (a) addresses the first element. 


Element one is: 


essentially correct if the solution has the correct equation and variables are 
defined correctly. 


partially correct if only the equation is correct. 


incorrect if the equation is not stated correctly. 


Part (b) addresses the second and third elements. 


Element two is: 


essentially correct if the student’s solution states that r = 0.755. 





partially correct if the student’s solution only states that r = ±0.755.


incorrect if the student states any other value of r including 
r = 0.726 (square root of R-Sq (adj)). 


Element three is: 


essentially correct if the student’s solution


addresses, based on a correct understanding of the correlation coefficient,
three or four of the following:
type of relationship
strength
direction
context


OR


states, based on a correct understanding of $r^2$:
that 57 percent of the variability in operating cost per hour can
be explained by a linear relationship between cost and number of
passenger seats
AND
that the relationship is positive.


Note: If the student gives a correct interpretation of r but then
incorrectly explains $r^2$, this is considered a parallel solution
and is incorrect.

partially correct if the student’s solution 


addresses exactly two of the following — type of relationship (linear), 
strength, direction, and context (based on a correct understanding of the 
correlation coefficient). 


OR 


only states that 57 percent of the variability in operating cost per hour
can be explained by a linear relationship between cost and number of
passenger seats (based on a correct understanding of $r^2$) BUT does
not state that the relationship is positive.


NOTE: Element three may be scored essentially or partially correct if the student uses a 
reasonable r (between 0 and 1) or R-Sq (adj) value. 


Part (c) addresses the fourth element. 

Element four is essentially correct if the student’s solution 
states that the existing line is not a good fit for the remaining seven points and 
correctly explains that the restricted data has a negative correlation or the 
recalculated least-squares regression line has a negative slope. 

Element four is partially correct if the student’s solution 
explains why the existing line is not a good fit for the remaining seven points but 
does not communicate that the restricted data has a negative correlation or the 
recalculated least-squares regression line has a negative slope. 


OR 


removes fewer than the specified five points, but gives a correct interpretation of 
the effect on the correlation or slope of the least-squares regression line. 


For elements 1 through 4, essentially correct responses count as one element and partially 
correct responses count as one-half of an element. 


4 Complete Response 
Four elements correct. 
3 Substantial Response 
Three elements correct. 
2 Developing Response 
Two elements correct. 
1 Minimal Response 
One element correct. 
If a paper is between two scores (for example, 2 1/2 elements) use a holistic approach to 


determine whether to score up or down depending on the strength of the response and quality 
of communication. 
Question 5
Solution 
Part (a): 
Possibilities include: 
1. $p_E$ = proportion of early birds who recall dreams
$p_N$ = proportion of night owls who recall dreams

$H_0: p_E - p_N = 0$ vs. $H_a: p_E - p_N < 0$ OR $H_0: p_E = p_N$ vs. $H_a: p_E < p_N$
OR
$p_E$ = proportion of early birds who do not recall dreams
$p_N$ = proportion of night owls who do not recall dreams

$H_0: p_E - p_N = 0$ vs. $H_a: p_E - p_N > 0$ OR $H_0: p_E = p_N$ vs. $H_a: p_E > p_N$
NOTE: Either of these, BUT NOT BOTH, can be used as one of the possibilities
for part (a).

2. $p_E$ = proportion of early birds who recall 5 or more dreams
$p_N$ = proportion of night owls who recall 5 or more dreams

$H_0: p_E - p_N = 0$ vs. $H_a: p_E - p_N < 0$ OR $H_0: p_E = p_N$ vs. $H_a: p_E < p_N$
OR
$p_E$ = proportion of early birds who do not recall 5 or more dreams
$p_N$ = proportion of night owls who do not recall 5 or more dreams

$H_0: p_E - p_N = 0$ vs. $H_a: p_E - p_N > 0$ OR $H_0: p_E = p_N$ vs. $H_a: p_E > p_N$
NOTE: Either of these, BUT NOT BOTH, can be used as one of the possibilities
for part (a).

3. $M_E$ = median number of dreams early birds recall
$M_N$ = median number of dreams night owls recall

$H_0: M_E - M_N = 0$ vs. $H_a: M_E - M_N < 0$ OR $H_0: M_E = M_N$ vs. $H_a: M_E < M_N$


NOTE: 1. A complete response for part (a) requires two pairs of hypotheses. 
2. Hypotheses for a chi-square test of homogeneity are not correct 
since this is a one-sided test. 
Part (b): 


Part 1: States a correct pair of hypotheses, identifies a correct test (by name or by formula) 
and checks appropriate requirements. 


$\mu_E$ = mean number of dreams early birds recall
$\mu_N$ = mean number of dreams night owls recall
$H_0: \mu_E = \mu_N$ vs. $H_a: \mu_E < \mu_N$ OR $H_0: \mu_E - \mu_N = 0$ vs. $H_a: \mu_E - \mu_N < 0$


Two-sample t-test: $t = \frac{\bar{x}_E - \bar{x}_N - 0}{\sqrt{\frac{s_E^2}{n_E} + \frac{s_N^2}{n_N}}}$

Requirements:
1. Problem states that independent random samples were taken.
2. Normal population distributions or large samples. Since these are not normal, we
need to note that $n_E = 100$ and $n_N = 100$ are both large in order to perform the t-test.


OR

Two-sample z-test: $z = \frac{\bar{x}_E - \bar{x}_N - 0}{\sqrt{\frac{s_E^2}{n_E} + \frac{s_N^2}{n_N}}}$

Requirements:
1. Problem states that independent random samples were taken.
2. Since $n_E = 100$ and $n_N = 100$ are both large, it is OK to perform the
approximate z-test.

OR

Pooled t-test: $t = \frac{\bar{x}_E - \bar{x}_N - 0}{\sqrt{\frac{s_p^2}{n_E} + \frac{s_p^2}{n_N}}}$ where $s_p^2$ is the pooled variance.

Requirements:
1. Problem states that independent random samples were taken.
2. Normal population distributions or large samples. Since these are not normal, we
need to note that $n_E = 100$ and $n_N = 100$ are both large in order to perform the t-test.
3. The sample standard deviations $s_E = 6.94$ and $s_N = 5.88$ are reasonably close, so it
is OK to assume that the two population variances are equal, i.e. $\sigma_E^2 = \sigma_N^2$.


Part 2: Correct mechanics, including the value of the test statistic, df, and P-value (or 
rejection region) 








$t = \frac{\bar{x}_E - \bar{x}_N}{\sqrt{\frac{s_E^2}{n_E} + \frac{s_N^2}{n_N}}} = \frac{7.26 - 9.55}{\sqrt{\frac{(6.94)^2}{100} + \frac{(5.88)^2}{100}}} = -2.52$

So, t = -2.52, df = 192, P-value = 0.006
• It is OK to use conservative df of 99.
• Using t-tables: P-value < 0.01
• Using calculator: t = -2.517578, P-value = 0.006304, df = 192.799
• Using z-test: P-value = 0.005908
• Using z-table: 0.0059 < P-value < 0.0060
• The pooled t-test results in the same value of t, but a df of 198.
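
The mechanics above can be reproduced from the summary statistics alone. The sketch below computes the unpooled test statistic and the Welch-Satterthwaite degrees of freedom, then the one-sided P-value under the large-sample z approximation (the t-distribution P-value would need a library outside the standard library, so it is not computed here). The function name `welch_summary` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist

def welch_summary(m1, s1, n1, m2, s2, n2):
    """Unpooled two-sample statistic and Welch-Satterthwaite df."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    stat = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return stat, df

# Early birds: mean 7.26, SD 6.94; night owls: mean 9.55, SD 5.88; n = 100 each
t_stat, df = welch_summary(7.26, 6.94, 100, 9.55, 5.88, 100)

# One-sided (lower-tail) P-value using the normal approximation
p_z = NormalDist().cdf(t_stat)

print(round(t_stat, 4), round(df, 1), round(p_z, 4))
```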


Part 3: Stating a correct conclusion in the context of the problem, using the result of the 
statistical test. 


Because the P-value is small (or less than an α selected and stated by the student), reject $H_0$.
There is convincing evidence that the mean number of dreams night owls recall is greater
than the mean number of dreams early birds recall.
Scoring 


Part (a): 
Essentially correct (E) if two distinct pairs of hypotheses are given and the parameters in the 
hypotheses are defined. 


Partially correct (P) if only one pair of correct hypotheses is given and the parameters in these 
hypotheses are defined or if two “approved” pairs of hypotheses are given but the parameters are 
poorly defined. 


Part (b): 

Each of the 3 parts of the hypothesis test is scored either as correct (E) or incorrect (I). 

• Because the hypotheses are given in the statement of part (a), they need not be restated here.
However, if wrong hypotheses are stated, then part 1 is scored as incorrect.

• Because the problem states that samples are random, it is OK if a student doesn’t repeat this.

• Some reference to both samples being large is essential.

• For the pooled t-test, some comment on the reasonableness of such an assumption is necessary. It
is not sufficient just to say that population variances are equal.


4 Complete Response 
Part (a) essentially correct and all three parts of the hypothesis test correct. 


3 Substantial Response 
Part (a) essentially correct and two parts of the hypothesis test correct. 
OR 
Part (a) partially correct or incorrect and three parts of the hypothesis test correct. 


Part (a) partially correct and two parts of the hypothesis test correct may be scored either as a 
2 or a 3, depending on the overall strength of the paper.


2 Developing Response 


Part (a) essentially correct and one part of the hypothesis test correct. 
OR 
Part (a) incorrect and two parts of the hypothesis test correct. 


Part (a) partially correct and two parts of the hypothesis test correct may be scored either as a 
2 or a 3, depending on the overall strength of the paper.


1 Minimal Response 


Part (a) essentially correct. 

OR 

Part (a) partially correct and zero or one parts of the hypothesis test correct. 
OR 

Part (a) is incorrect and one part of the hypothesis test correct. 
Question 6
Solution 
Part (a): 


p = proportion of students at this university who would respond S to the question, “Do you prefer
S or F?” 


Large sample confidence interval for a population proportion. 


$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$


State and Check Assumptions and Conditions: 


Simple random sample (given in the problem stem; need not be mentioned in solution). Also
need large sample with $n\hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$. Here,


$n\hat{p} = 185$ and $n(1 - \hat{p}) = 139$ are both greater than 5 (or 10)


OR $\hat{p} \pm 3\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ is entirely in the interval (0,1).


We assume the university has at least 10(324) = 3240 students ($N \ge 10n$).


Calculations: 


$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.571 \pm 1.96\sqrt{\frac{(0.571)(0.429)}{324}} = 0.571 \pm 0.054 = (0.517, 0.625)$


Calculator solution: (0.5171, 0.62488). The procedure is specified in the stem, but students still need 
to check assumptions and conditions. 
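
A minimal sketch of the large-sample interval, using the counts from the solution (185 S responses out of 324). The function name `one_prop_ci` is illustrative; the critical value comes from `NormalDist().inv_cdf` rather than the rounded 1.96, so the result should sit closer to the calculator interval than to the hand computation.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_ci(successes, n, conf=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = one_prop_ci(185, 324)
print(round(low, 4), round(high, 4))
```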


Interpretation: 


Based on this sample, we can be 95 percent confident that the proportion of students at this university 
who would respond S to the question, “Do you prefer S or F?” is between 0.517 and 0.625. 

OR 

We have 95 percent confidence that the interval (0.5171, 0.62488) captures the proportion of 
students who would respond “S” to the question, “Do you prefer S or F?” 
Part (b): 
Meaning of Confidence Level: 
In repeated sampling, 95 percent of the intervals produced using this method will contain the 
proportion of students at this university who would respond S to the question “Do you prefer S or F?” 
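
The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. Everything here is hypothetical: the true proportion 0.57, the number of repetitions, and the seed are chosen only for the sketch. It draws many samples of size 324, builds the 95 percent interval from each, and reports the fraction of intervals that contain the true proportion.

```python
import random
from math import sqrt

def coverage(true_p=0.57, n=324, reps=2000, z_star=1.96, seed=1):
    """Fraction of simulated intervals that capture the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / reps

rate = coverage()
print(rate)
```

The reported rate should land near 0.95. Any single interval either contains the true proportion or does not; the 95 percent describes the method, not one interval.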
Part (c): 
Approach 1: Hypothesis Test - Two-Proportion z-test


States a Correct Pair of Hypotheses: 


$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 \ne 0$
where
$p_1$ = proportion of students at this university who would respond S with the original
question wording
$p_2$ = proportion of students at this university who would respond S with the revised
question wording
Note: A one-sided test is incorrect. 
Name Test and State and Check Assumptions and Conditions: 
Identifies a correct test (by name or by formula), and checks appropriate assumptions. 
Two sample z-test for proportions 


Note: Problem states that samples are random samples, so this does not need to be 
addressed in the assumptions. 


Large samples: $n_1\hat{p}_1 = 185$; $n_1(1 - \hat{p}_1) = 139$; $n_2\hat{p}_2 = 68$; $n_2(1 - \hat{p}_2) = 88$
All are greater than 5 (or 10), so the sample sizes are large enough. 
Calculations: 


Correct mechanics, including P-value or rejection region (except for minor arithmetic 
errors) 


For two sample proportion z-test: 














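$\hat{p}_1 = \frac{185}{324} = 0.571 \qquad \hat{p}_2 = \frac{68}{156} = 0.436$

$\hat{p} = \frac{185 + 68}{324 + 156} = \frac{253}{480} = 0.527$

$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_1} + \frac{\hat{p}(1 - \hat{p})}{n_2}}} = \frac{0.571 - 0.436}{\sqrt{\frac{(0.527)(0.473)}{324} + \frac{(0.527)(0.473)}{156}}} = \frac{0.135}{0.0487} = 2.77$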


P-value = 2(.0028) = .0056 from tables 
(Calculator: z = 2.776554085, P-value = .0054939656) 


If the proportions are not pooled, then z = 2.795 and p = 0.00518. 
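
The two-proportion z-test can be sketched directly from the counts (185 of 324 and 68 of 156). The function name `two_prop_z` and its `pooled` flag are illustrative assumptions; the two-sided P-value uses the standard normal from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2, pooled=True):
    """Two-sided z-test for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        p = (x1 + x2) / (n1 + n2)   # pooled estimate under H0
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_prop_z(185, 324, 68, 156)
z_unpooled, p_unpooled = two_prop_z(185, 324, 68, 156, pooled=False)
print(round(z, 4), round(p_value, 5))
```

Carrying full precision instead of the rounded 0.135 / 0.0487 gives the calculator value of z rather than 2.77.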


Conclusion: 


Since the P-value (0.0055) is so small, we reject the null hypothesis that the proportions 
of this university’s students who would respond S to the two survey questions are equal. 
We believe the order in which the choices are given affects the students’ response. 


Stating a correct conclusion in the context of the problem, using the result of the 
statistical test (i.e., linking the conclusion to the result of the hypothesis test). 


If both an α and a P-value are given, the linkage is implied. If no α is given, the solution
must be explicit about the linkage by giving a correct interpretation of the P-value or
explaining how the conclusion follows from the P-value.


If the P-value in part 3 is incorrect but the conclusion is consistent with the computed
P-value, part 4 should be considered as correct.
Approach 2: Hypothesis Test — Chi-square Test for Homogeneity 
States a correct pair of hypotheses: 
$H_0$: population response proportions are the same for the two question wordings
$H_a$: population response proportions are not the same for the two question
wordings


NOTE: Although the computations are the same, stating the hypotheses in terms of 
independence is not correct. 


Name Test and State and Check Assumptions and Conditions: 
Identifies a correct test (by name or by formula), and checks appropriate assumptions. 


Chi-square test for homogeneity 


Observed counts

                     S      F      Row total
Original wording     185    139    324
Revised wording      68     88     156
Column total         253    227    480


Expected counts

                     S          F
Original wording     170.775    153.225
Revised wording      82.225     73.775


NOTE: Problem states that samples are random samples, so this does not need to be 
addressed in the assumptions. All expected counts are greater than five, so the sample 
sizes are large enough. 


Calculations: 


Correct mechanics, including P-value or rejection region (except for minor arithmetic 
errors) 


For chi-square test: 





$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(185 - 170.775)^2}{170.775} + \cdots + \frac{(88 - 73.775)^2}{73.775} = 7.7092$


df = 1; from tables, 0.005 < P-value < 0.01


(Calculator: P-value = 0.0054938481) 
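
The chi-square computation can be sketched from the observed table. Expected counts are (row total)(column total)/(grand total). The P-value step relies on a fact that holds only for df = 1: a chi-square variable with one degree of freedom is the square of a standard normal, so the upper-tail area can be taken from the normal distribution. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

observed = [[185, 139],   # original wording: S, F
            [68, 88]]     # revised wording:  S, F

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))

# Valid for df = 1 only: P(chi-square > x) = 2 * (1 - Phi(sqrt(x)))
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_sq)))

print(round(chi_sq, 4), round(p_value, 5))
```

The statistic equals the square of the pooled two-proportion z statistic, which is why the two approaches give matching P-values.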


Conclusion: 
Since the P-value is so small, we reject the null hypothesis that the proportions of 
students at this university who would respond S are the same for the two survey 
questions. We believe the order in which the choices are given affects the students’ 


response. 


Stating a correct conclusion in the context of the problem, using the result of the 
statistical test (i.e., linking the conclusion to the result of the hypothesis test). 


If both an α and a P-value are given, the linkage is implied. If no α is given, the solution
must be explicit about the linkage by giving a correct interpretation of the P-value or
explaining how the conclusion follows from the P-value.
If the P-value in part 3 is incorrect but the conclusion is consistent with the computed
P-value, part 4 should be considered as correct.

Approach 3: Two sample confidence interval 
Name Test and State and Check Assumptions and Conditions: 


Two-sample confidence interval. 


Problem states that samples are random samples, so this does not need to be 
addressed in the assumptions. 


Large samples:
$n_1\hat{p}_1 = 185$; $n_1(1 - \hat{p}_1) = 139$; $n_2\hat{p}_2 = 68$; $n_2(1 - \hat{p}_2) = 88$


All are larger than 5 (or 10), so the sample sizes are large enough.


Calculations:


$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$


90 percent CI: (0.05565, 0.21453)
95 percent CI: (0.04044, 0.22974)
99 percent CI: (0.01069, 0.25949)
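
The three intervals for the difference in proportions can be reproduced with a short sketch. The function name `two_prop_ci` is an assumption for illustration; the standard error is unpooled, and the critical values come from `NormalDist().inv_cdf`.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, conf):
    """Large-sample confidence interval for p1 - p2 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

intervals = {conf: two_prop_ci(185, 324, 68, 156, conf)
             for conf in (0.90, 0.95, 0.99)}

for conf, (low, high) in intervals.items():
    print(f"{conf:.0%} CI: ({low:.5f}, {high:.5f})")
```

None of the three intervals contains 0, which is the basis for the conclusion in this approach.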
Conclusion: 


Since the confidence interval does not include 0, there is evidence that the proportion of
students at this university who respond S is not the same for the two question wordings.


Approach 4: (This approach can score at most partially correct for part (c)) 


One sample confidence interval for $p_2$.


All checks and assumptions must be included (same as in section (a)). 
95 percent CI: (0.35808, 0.51371) 


Since this interval does not overlap with the interval computed in part (a), (0.517, 0.625),
conclude that the proportion of this university’s students who would respond S is not the
same for the two question wordings.


Part (d): 


If the sample sizes had been equal, it would be reasonable to combine the data from the 
two samples by pooling, which would be equivalent to averaging the two proportions in 
this case. But, since the wording of the question makes a difference, and more people 
were asked the original version than were asked the revised version, we cannot just pool. 


OR 


It is only reasonable to pool estimates if they are estimating the same population 
parameter. Here the proportion who would respond S differs with the survey question so 
the estimates should not be pooled. 


Approach 1: 


One reasonable approach would be to scale sample 2 up to a sample size of 324 while 
maintaining the same sample proportion. To do this, the number of S's in 

sample 2 would be multiplied by a factor of 2.076923 (It is OK if the student uses a 
factor of 2 for simplicity). This would result in two samples of sizes 324 with 

185 S's in sample 1 and 141 S's in sample 2. This would result in an estimate of those
who prefer S of


$\hat{p} = \frac{185 + 141}{324 + 324} = \frac{326}{648} = 0.503$


Note: A comparable approach would be to scale sample 1 down to a sample size of 156 
by using a multiplier of 0.481481 to obtain 


$\hat{p} = \frac{89 + 68}{156 + 156} = \frac{157}{312} = 0.503$. This is very close to 0.5.


Approach 2: 


The approach described above is equivalent to just averaging the two proportions, and so 
averaging the two given proportions is also an acceptable approach. 


$\hat{p} = \frac{\frac{185}{324} + \frac{68}{156}}{2} = \frac{0.571 + 0.436}{2} \approx 0.503$


Note: A weighted average of the two proportions (with weights proportional to sample 
size) is equivalent to the given pooled value. If the student rejects the pooled value and
proposes a weighted average of the two sample proportions as an alternative, part (d) is 
incorrect. 
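
The difference between pooling and the two acceptable alternatives in part (d) is easy to see numerically. This sketch uses the counts from the solution; the variable names are illustrative. Pooling weights each wording by its sample size, while averaging (or rescaling sample 2 to size 324) gives both wordings equal weight.

```python
x1, n1 = 185, 324   # original wording: S responses, sample size
x2, n2 = 68, 156    # revised wording:  S responses, sample size

# Pooled value (weights proportional to sample size): not acceptable here
pooled = (x1 + x2) / (n1 + n2)

# Approach 2: simple average of the two sample proportions
average = (x1 / n1 + x2 / n2) / 2

# Approach 1: scale sample 2 up to size 324, keeping its proportion
scaled_x2 = round(x2 * n1 / n2)
scaled_up = (x1 + scaled_x2) / (2 * n1)

print(round(pooled, 3), round(average, 3), scaled_x2, round(scaled_up, 3))
```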


Scoring for Question 6 
Parts (a) and (b) should be read together. 


Part (a) is scored as essentially correct (E) if the assumptions are checked, the interval is 
computed correctly (except for minor arithmetic errors), and a correct interpretation of 
the interval is given. 


It is partially correct (P) if the interval is computed correctly (except for minor arithmetic 
errors) but either the assumptions or the interpretation is not correct. 


Otherwise, part (a) is scored as incorrect (I). 


Part (b) is scored as either essentially correct (E) or incorrect (I). It is not possible to 
score partially correct on this part. 


Part (c) is scored as essentially correct (E) if all four parts of the hypothesis test are 
correct. It is scored as partially correct (P) if two or three of the components of the test 
are correct. Otherwise, it is scored as incorrect (I).


Part (d) is scored as essentially correct (E) if the student produces a reasonable estimate 
that takes the different sample sizes into account, the explanation is correct and 
communication is good. 


It is partially correct (P) if 


a reasonable estimate is produced but the explanation is incorrect or weak 
OR 

a good explanation of why not to use the pooled estimate but no reasonable 
alternative is given 


For parts (a), (b), and (c), essentially correct responses count as 1 part and partially correct
responses count as 1/2 part.


4 Complete Response
Four parts correct.

3 Substantial Response
Three parts correct.

2 Developing Response
Two parts correct.

1 Minimal Response
One part correct.


If a paper is between two scores (for example, 2 1/2 parts) use a holistic approach to determine
whether to score up or down depending on the strength of the response and communication. 


Note: If the paper is between two scores and (a) or (c) has the interpretation correct, then round 
up. If neither is correct, round down. 